\documentclass{article}

\input{/usr/local/texlive/texmf-local/textPreamble}

\usepackage[font={scriptsize}]{caption}

\begin{document}

%\setlength{\parindent}{0.0in}

\title{Supplementary Materials:\\ Four Factor Imagination Theory (FFIT) Scales: Reliability, Validity, and Factor Structure}

\begin{center}
 {\bf David M. Condon$^{1}$ \& Darya L. Zabelina$^{2}$}
\end{center}

\blfootnote{\noindent $^{1}$ Correspondence about these analyses should be directed to David M. Condon (\href{mailto:david-condon@northwestern.edu}{david-condon@northwestern.edu}), Assistant Professor, Department of Medical Social Sciences, Northwestern University, Chicago, Illinois.\\
\noindent $^{2}$ Correspondence about the primary manuscript should be directed to Darya L. Zabelina (\href{mailto:Darya.Zabelina@colorado.edu}{Darya.Zabelina@colorado.edu}), Research Associate, Institute of Cognitive Science, University of Colorado, Boulder, Colorado.}

\vspace{-1.0cm}

%%%%%%%%%%%%%%%%%%
%Add Breaking Line
\begin{center}
\linethickness{0.5mm}
\line(1,0){300}
\end{center}
%%%%%%%%%%%%%%%%%%

These materials provide a supplement to the manuscript titled Four Factor Imagination Theory (FFIT) Scales: Reliability, Validity, and Factor Structure.  Note that throughout this supplement, the scales are frequently referred to by the acronym ImQ.  This is because the name of the measure(s) (and the manuscript) was changed shortly before submission to distinguish it from other measures with the same name. I have opted against re-naming all of the variables in these analyses. 

The analyses are organized around each of the studies described in the article --- in order, these are (1) the Qualitative study; (2) the Exploratory study; and (3) the Validation study. The data for these analyses are available at the SAPA-Project Dataverse at the following location (and citation):

\vspace{5mm}

Condon, D. M. \& Zabelina, D. L. (2017). Reproducibility Data: Four Factor Imagination Theory Scales. \textit{Harvard Dataverse}. \url{http://dx.doi:10.7910/DVN/HM43EW}

\begin{scriptsize}
<<Step0, eval=TRUE, echo=FALSE, results='hide', fig.show="hide", tidy=TRUE, message=FALSE, warning=FALSE>>= 
# In this first step we are simply getting things ready by installing and loading necessary packages and functions.  These are hidden in the pdf output.
rm(list = ls())
suppressPackageStartupMessages(library(psych)) 
filepathdata <- "/Users/dmc174/Drive/MPSP/MSS/2017/ImQ/data/"
@
\end{scriptsize}

\section{Study 1: Qualitative Sample}

Here we conduct very preliminary quantitative analyses based on the administration of a initial set of 87 items to a small sample (N=17).  The primary purpose of the Qualitative study was to get feedback on the general scope of the project, the item set as a whole and to identify ``weird'' items.  On the basis of this feedback, many of the original items were deleted or revised and several new items were developed.

In the following section of code, we load the data and create subsets of the items (one for each of the four proposed aspects of Imagination).
\begin{scriptsize}
<<Step1, eval=TRUE, echo=TRUE, results='markup', fig.show="hide", tidy=TRUE, message=FALSE, warning=FALSE>>= 
load(paste(filepathdata, "qualitativeSample/qualitativeSampleImQ.rdata", sep=""))

qualitativeSampleImQ <- scrub(qualitativeSampleImQ, where=c(2:88), isvalue="0")
prelim87items <- subset(qualitativeSampleImQ, select = c(F1:D20))
prelimFitems20 <- subset(prelim87items, select = c(F1:F20))
prelimCitems24 <- subset(prelim87items, select = c(C1:C24))
prelimEitems23 <- subset(prelim87items, select = c(E1:E23))
prelimDitems20 <- subset(prelim87items, select = c(D1:D20))
@
\end{scriptsize}

In the next step, we evaluate the correlations to get a sense of the extent to which the items hang together within and across aspects.
\begin{scriptsize}
<<Step2, eval=TRUE, echo=TRUE, results='markup', fig.show="hide", tidy=TRUE, message=FALSE, warning=FALSE>>= 
prelim87itemsCor <- round(cor(prelim87items, use="pairwise"), 2)
Fcor <- round(cor(prelimFitems20, use="pairwise"), 2)
Ccor <- round(cor(prelimCitems24, use="pairwise"), 2)
Ecor <- round(cor(prelimEitems23, use="pairwise"), 2)
Dcor <- round(cor(prelimDitems20, use="pairwise"), 2)
@
<<Step3, eval=TRUE, echo=FALSE, results='hide', fig.show="hide", tidy=TRUE, message=FALSE, warning=FALSE>>= 
pdf(file=paste(filepathdata, "qualitativeSample/prelim87itemsCor.pdf", sep=""))
cor.plot(prelim87itemsCor, numbers = FALSE, cex.axis=.8, n=51, main="", zlim = c(-1,1), cex = .5, show.legend=TRUE, n.legend=10, keep.par=TRUE, las=2)
dev.off()

pdf(file=paste(filepathdata, "qualitativeSample/Fitems.pdf", sep=""))
cor.plot(Fcor, numbers = TRUE, cex.axis=.8, n=51, main="", zlim = c(-1,1), cex = .5, show.legend=TRUE, n.legend=10, keep.par=TRUE, las=2)
dev.off()

pdf(file=paste(filepathdata, "qualitativeSample/Citems.pdf", sep=""))
cor.plot(Ccor, numbers = TRUE, cex.axis=.8, n=51, main="", zlim = c(-1,1), cex = .5, show.legend=TRUE, n.legend=10, keep.par=TRUE, las=2)
dev.off()

pdf(file=paste(filepathdata, "qualitativeSample/Eitems.pdf", sep=""))
cor.plot(Ecor, numbers = TRUE, cex.axis=.8, n=51, main="", zlim = c(-1,1), cex = .5, show.legend=TRUE, n.legend=10, keep.par=TRUE, las=2)
dev.off()

pdf(file=paste(filepathdata, "qualitativeSample/Ditems.pdf", sep=""))
cor.plot(Dcor, numbers = TRUE, cex.axis=.8, n=51, main="", zlim = c(-1,1), cex = .5, show.legend=TRUE, n.legend=10, keep.par=TRUE, las=2)
dev.off()
@
\end{scriptsize}

Without bothering to look at the item level results, visual inspection of the correlation plots suggests that Complexity and Emotional Valence are more clearly defined by these original 87 items than Frequency and Directedness.

\begin{figure}
\begin{floatrow}
\ffigbox{\includegraphics[width=.45\textwidth]{data/qualitativeSample/Eitems.pdf}}{\caption{Study 1 - 23 Emotional Valence items}\label{fig:EmotionalValence}}
\ffigbox{\includegraphics[width=.45\textwidth]{data/qualitativeSample/Citems.pdf}}{\caption{Study 1 - 24 Complexity items}\label{fig:Complexity}}
\end{floatrow}
\end{figure}

\begin{figure}
\begin{floatrow}
\ffigbox{\includegraphics[width=.45\textwidth]{data/qualitativeSample/Fitems.pdf}}{\caption{Study 1 - 20 Frequency items}\label{fig:Frequency}}
\ffigbox{\includegraphics[width=.45\textwidth]{data/qualitativeSample/Ditems.pdf}}{\caption{Study 1 - 20 Directedness items}\label{fig:Directedness}}
\end{floatrow}
\end{figure}

\section{Study 2: Exploratory Quantitative Analyses}

The primary goal of Study 2 was to conduct exploratory quantitative analyses on the revised 84-item set from Study 1 with the intention to winnow the item pool into a smaller, more manageable subset.  To do this, we collected data from a large sample of MTurk workers using a survey created in Qualtrics.

\subsection{Data Collection Procedures}

\noindent Here are the specific details provided on MTurk:

\begin{blockquote}
Survey Title: ``Answer a survey about your imagination''\\
Description: ``Tell us about various aspects of your imagination.''\\
Keywords: ``survey, demographics, imagination''\\
Reward per assignment: \$0.50\\
Time alloted per assignment: 1 hr\\
HIT expires in: 7 days\\
Auto-approve and pay Workers in: 3 days\\
Specify ALL the qualifications Workers must meet to work on your HITs: Masters\\
HIT approval rate: $>=$ 99\% \\
Number of HITs approved $>=$ 5000\\
Location is not India\\
Only Workers who qualify to do my HITs can preview my HITs.\\
Instructions:  ``We are conducting an academic survey about imagination. Select the link below to complete the survey. At the end of the survey, you will receive a code to paste into the box below to receive credit for taking our survey.  If you have recently completed this survey or one like it about imagination, please do not participant again.''
\end{blockquote}

Note that MTurk was only used to recruit participants.  All of the data were collected through Qualtrics.  After being directed to the Qualtrics site, participants were prompted with the following instructions: ``Thank you for participating in this survey.  If you have previously participated, you will not be reimbursed for completing the survey again.  Please choose the most appropriate response to each item by comparing yourself to others.''  To help ensure that workers did not participate more than once, the ballot-stuffing protection feature was used in Qualtrics.  

In addition to all of the exploratory items, the Qualtrics survey contained 3 items that were intended to confirm that participants were paying attention.  Participants who gave an answer that did not match the requested response were deemed to be inattentive; those who missed more than one of the ``attention check'' items were immediately exited from the survey and their data were removed.  The text given to these participants read as follows:  ``Thank you for your interest in our survey.  Unfortunately, there were several items (2 or more) where it appeared that you were not paying attention.  As a result, we have rejected your submission.''

\subsection{Clean the data}

In order to get the data into R for analysis, you first have to download the csv file from Qualtrics, then delete the 2nd row and then change the label of the duration column header.

The code below pulls in the raw data files and cleans everything up.  This includes re-ordering and re-naming variables for ease of use; analyzing the many variables inserted into the survey to ensure that participants were paying attention and dropping those participants who did not answer correctly; and ensuring that none of the MTurk participants took the survey more than once.

\begin{scriptsize}
<<Step4, eval=TRUE, echo=TRUE, results='markup', fig.show="hide", tidy=TRUE, message=FALSE, warning=FALSE>>= 
load(paste(filepathdata, "exploratorySample/exploratoryDataRawImQ.rdata", sep=""))

# Drop some of the variables we aren't using.
exploratorySampleImQ <- subset(exploratorySampleImQ, select = -c(StartDate:Progress, Finished:DistributionChannel))

# change the class for our data variables (just in case)
for(i in 1:ncol(exploratorySampleImQ)){
  exploratorySampleImQ[,i] <- as.numeric(exploratorySampleImQ[,i])
}

# drop invalid responses where no good answers were provided
dataOnly <- subset(exploratorySampleImQ, select = c(F101:D120))
valid <- apply(dataOnly, 1, function(x) sum(!is.na(x)))
exploratorySampleImQtemp <- data.frame(exploratorySampleImQ, valid)
exploratorySampleImQ <- exploratorySampleImQtemp[exploratorySampleImQtemp$valid > 0, ]

dim(exploratorySampleImQ)

# drop invalid responses where no good answers were provided
secPerItem <- exploratorySampleImQ$Duration/exploratorySampleImQ$valid
exploratorySampleImQtemp <- data.frame(exploratorySampleImQ, secPerItem)
exploratorySampleImQ <- exploratorySampleImQtemp[exploratorySampleImQtemp$secPerItem > 2, ]

dim(exploratorySampleImQ)

# re-code the attention check variables.
exploratorySampleImQ$Check1 <- ifelse(exploratorySampleImQ$Check1==4, 0, 1)
exploratorySampleImQ$Check2 <- ifelse(exploratorySampleImQ$Check2==3, 0, 1)
exploratorySampleImQ$Check3 <- ifelse(exploratorySampleImQ$Check3==5, 0, 1)

# create a new variable that sums the attention check scores.
exploratorySampleImQ$CheckSum <- rowSums(exploratorySampleImQ[,c(findcolnumber(exploratorySampleImQ, "Check1"):findcolnumber(exploratorySampleImQ, "Check3"))], na.rm=T)

# identify participants who failed 1 or more attention check.
inattentive <- exploratorySampleImQ[exploratorySampleImQ$CheckSum > 0, "randomCode"]
inattentive

# see how many each of them failed
exploratorySampleImQ[exploratorySampleImQ$randomCode %in% inattentive,"CheckSum"]

# how many of them are there?
length(inattentive)

# drop everyone who failed an attention check.
attentive <- exploratorySampleImQ[exploratorySampleImQ$CheckSum < 1, "randomCode"]
exploratorySampleImQ <- exploratorySampleImQ[exploratorySampleImQ$randomCode %in% attentive,]

exploratorySampleImQ <- subset(exploratorySampleImQ, select = -c(Duration, randomCode, Check1, Check2, Check3, CheckSum, valid, secPerItem))

dim(exploratorySampleImQ)
describe(exploratorySampleImQ, fast = TRUE)
@
\end{scriptsize}

\subsection{Stability estimates of the correlations}

In this section of code we are describing the sample size, the number of pairwise administrations among the items and generating bootstrapped estimates of the standard errors around the item-level correlations.
\begin{scriptsize}
<<Step5, eval=TRUE, echo=TRUE, results='markup', fig.show="hide", tidy=TRUE, message=FALSE, warning=FALSE>>=
# How many participants are in the sample?
dim(exploratorySampleImQ)[1]

# How many times was each item administered?
AdminsMean <- round(mean(describe(exploratorySampleImQ)$n), 0)
AdminsMean <- prettyNum(AdminsMean, big.mark=",", scientific=F)

AdminsMedian <- round(median(describe(exploratorySampleImQ)$n), 0)
AdminsMedian <- prettyNum(AdminsMedian, big.mark=",", scientific=F)

AdminsSD <- round(sd(describe(exploratorySampleImQ)$n), 0)
AdminsSD <- prettyNum(AdminsSD, big.mark=",", scientific=F)

# The mean, median and sd of administrations of items
AdminsMean
AdminsMedian
AdminsSD

pwiseAdmins1 <- pairwiseDescribe(exploratorySampleImQ)
pwiseAdmins <- prettyNum(pwiseAdmins1, big.mark=",", scientific=F)

pwiseAdminsMean <- round(pwiseAdmins1$mean, 0)
pwiseAdminsMean <- prettyNum(pwiseAdminsMean, big.mark=",", scientific=F)

pwiseAdminsMedian <- round(pwiseAdmins1$median, 0)
pwiseAdminsMedian <- prettyNum(pwiseAdminsMedian, big.mark=",", scientific=F)

pwiseAdminsSD <- round(pwiseAdmins1$sd, 0)
pwiseAdminsSD <- prettyNum(pwiseAdminsSD, big.mark=",", scientific=F)

# The mean, median and sd of pairwise administrations of items
pwiseAdminsMean
pwiseAdminsMedian
pwiseAdminsSD

expSampleImQcorCI <- corCi(exploratorySampleImQ, n.iter = 1000, plot="FALSE")

# Mean sd of correlations in the exploratory sample.
meanSDofRsE <- mean(expSampleImQcorCI$sds)
meanSDofRsE
# Median sd of correlations in the exploratory sample.
medianSDofRsE <- median(expSampleImQcorCI$sds)
medianSDofRsE
# min sd of correlations in the exploratory sample.
maxSDofRsE <- max(expSampleImQcorCI$sds)
maxSDofRsE
# max sd of correlations in the exploratory sample.
minSDofRsE <- min(expSampleImQcorCI$sds)
minSDofRsE

# Exploratory sample standard errors for item correlations
pdf(file=paste(filepathdata, "exploratorySample/expCIs.pdf", sep=""))
plot(density(expSampleImQcorCI$sds), main="", ylab="Frequency", xlab="Standard error of item level correlations", xlim= c(0,.25), ylim = c(0,30))
dev.off()
@
\end{scriptsize}

Note that the sample size is large but the item-level correlations are only moderately stable.  This is acceptable for exploratory analyses aimed at winnowing the item pool but a much larger sample will be needed to corroborate (replicate) whatever findings result from these analyses (see the SAPA sample in the next section).

\begin{figure}
\begin{center}
\ffigbox{\includegraphics[width=.5\textwidth]{data/exploratorySample/expCIs.pdf}}{\caption{Distribution of SEs of the correlations between items in Study 2 (exploratory sample)}\label{fig:expCIs}}
\end{center}
\end{figure}

\subsection{Exploratory Factor Analysis}

The hypothesized structure included 4 factors, but it is important to consider whether this solution is superior to alternative structures.
\begin{scriptsize}
<<Step6, eval=TRUE, echo=TRUE, results='markup', tidy=TRUE, message=FALSE, warning=FALSE>>= 
getOption("width")
options(width=150)
nfac <- suppressWarnings(suppressMessages(nfactors(exploratorySampleImQ, 10)))
nfac
@
\end{scriptsize}

Unfortunately, most of the fit statistics do not converge.  Of those that do, the MAP criterion and mean complexity value suggest 4 factors while VSS suggests 2.  

Here we sort the items based on loadings in the four factor solution, then pull out the items with primary loadings above abs value of 0.5 and secondary loadings below abs value of 0.2.
\begin{scriptsize}
<<Step7, eval=TRUE, echo=TRUE, results='markup', tidy=TRUE, message=FALSE, warning=FALSE>>= 
fa4 <- suppressWarnings(suppressMessages(fa(exploratorySampleImQ,4)))
fa4sort <- fa.sort(fa4)
fa4sort

fa4ldgs <- fa4sort$loadings

fa1items <- names(subset(fa4ldgs[,"MR1"], abs(fa4ldgs[,"MR1"]) > .5))
fa1items <- names(subset(fa4ldgs[fa1items,"MR2"], abs(fa4ldgs[fa1items,c("MR2")]) < .2))
fa1items <- names(subset(fa4ldgs[fa1items,"MR3"], abs(fa4ldgs[fa1items,c("MR3")]) < .21))
fa1items <- names(subset(fa4ldgs[fa1items,"MR4"], abs(fa4ldgs[fa1items,c("MR4")]) < .2))
# show the qualified items on the first factor and their loadings.
round(fa4ldgs[fa1items,],2)

fa2items <- names(subset(fa4ldgs[,"MR2"], abs(fa4ldgs[,"MR2"]) > .5))
fa2items <- names(subset(fa4ldgs[fa2items,"MR1"], abs(fa4ldgs[fa2items,c("MR1")]) < .2))
fa2items <- names(subset(fa4ldgs[fa2items,"MR3"], abs(fa4ldgs[fa2items,c("MR3")]) < .2))
fa2items <- names(subset(fa4ldgs[fa2items,"MR4"], abs(fa4ldgs[fa2items,c("MR4")]) < .2))
# show the qualified items on the second factor and their loadings.
round(fa4ldgs[fa2items,],2)

fa3items <- names(subset(fa4ldgs[,"MR3"], abs(fa4ldgs[,"MR3"]) > .5))
fa3items <- names(subset(fa4ldgs[fa3items,"MR1"], abs(fa4ldgs[fa3items,c("MR1")]) < .2))
fa3items <- names(subset(fa4ldgs[fa3items,"MR2"], abs(fa4ldgs[fa3items,c("MR2")]) < .2))
fa3items <- names(subset(fa4ldgs[fa3items,"MR4"], abs(fa4ldgs[fa3items,c("MR4")]) < .2))
# show the qualified items on the third factor and their loadings.
round(fa4ldgs[fa3items,],2)

fa4items <- names(subset(fa4ldgs[,"MR4"], abs(fa4ldgs[,"MR4"]) > .5))
fa4items <- names(subset(fa4ldgs[fa4items,"MR1"], abs(fa4ldgs[fa4items,c("MR1")]) < .2))
fa4items <- names(subset(fa4ldgs[fa4items,"MR2"], abs(fa4ldgs[fa4items,c("MR2")]) < .2))
fa4items <- names(subset(fa4ldgs[fa4items,"MR3"], abs(fa4ldgs[fa4items,c("MR3")]) < .2))
# show the qualified items on the fourth factor and their loadings.
round(fa4ldgs[fa4items,],2)
ImQitems <- c(fa1items, fa2items, fa3items, fa4items)
# total item count
length(ImQitems)
# create a subset of the data with only those items.
ImQitemSet <- subset(exploratorySampleImQ, select = c(ImQitems))
@
\end{scriptsize}

\subsection{Table the top items and their loadings}

Put it all together to see the items by factor with the loadings.  The table in this section shows those items with primary loadings above 0.5 and no secondary loadings above 0.2 (on any of the remaining 3 factors).
\begin{tiny}
<<Step8, eval=TRUE, echo=TRUE, results='markup', tidy=TRUE, message=FALSE, warning=FALSE>>= 
getOption("width")
options(width=150)
ImQtable <- data.frame(t(ImQcolnames[ImQitems]), round(fa4ldgs[ImQitems,],2))
colnames(ImQtable) <- c("item", "F1ldg", "F2ldg", "F3ldg", "F4ldg")
ImQtable
@
\end{tiny}

\noindent These are the factors: 

\begin{blockquote}
Frequency (F1 - 10 items): F101, F113, F102, D116, F107, D114, D115, F108, D112, F104\\
Emotional Valence (F2 - 9 items): EV117, EV121, EV122, EV119, EV116, EV125, EV107, EV114, EV123\\
Directedness (F3 - 6 items): D109, D101, D111, D105, D110, D107\\
Complexity (F4 - 8 items): C118, C117, C20, C11, C105, C107, C108, C2\\
\end{blockquote}

\subsection{Unidimensionality of the factors}

Here, evaluate the dimensionality of the full item set (i.e., the most highly loaded items on the four factors, not ALL the items).
\begin{scriptsize}
<<Step9, eval=TRUE, echo=TRUE, results='markup', tidy=TRUE, message=FALSE, warning=FALSE>>= 
omegaImQitems3 <- suppressWarnings(suppressMessages(omega(ImQitemSet, 3, plot=FALSE)))
omegaImQitems4 <- suppressWarnings(suppressMessages(omega(ImQitemSet, 4, plot=FALSE)))
omegaImQitems5 <- suppressWarnings(suppressMessages(omega(ImQitemSet, 5, plot=FALSE)))
omegaImQitems6 <- suppressWarnings(suppressMessages(omega(ImQitemSet, 6, plot=FALSE)))
omegaImQitemsECV <- c(omegaImQitems3$ECV, omegaImQitems4$ECV, omegaImQitems5$ECV, omegaImQitems6$ECV)
names(omegaImQitemsECV) <- c("3GF", "4GF", "5GF", "6GF")
omegaImQitemsO_h <- c(omegaImQitems3$omega_h, omegaImQitems4$omega_h, omegaImQitems5$omega_h, omegaImQitems6$omega_h)
names(omegaImQitemsO_h) <- c("3GF", "4GF", "5GF", "6GF")
omegaMeasures <- round(as.data.frame(rbind(omegaImQitemsECV, omegaImQitemsO_h)), 2)
rownames(omegaMeasures) <- c("ECV", "omega_h")
omegaMeasures$means <- round(rowMeans(omegaMeasures), 2)
omegaMeasures
alphaImQitems <- alpha(ImQitemSet, check.keys = TRUE)
unidim <- round(alphaImQitems$Unidim[[1]], 2)
alphaMeasures <- data.frame(unidim, round(alphaImQitems$total, 2))
alphaMeasures
@
\end{scriptsize}

Now, evaluate the unidimensionality of each factor in the four factor solution.

\begin{scriptsize}
<<Step10, eval=TRUE, echo=TRUE, results='markup', tidy=TRUE, message=FALSE, warning=FALSE>>= 
ImQfa1ItemSet <- subset(exploratorySampleImQ, select = c(fa1items))
ImQfa2ItemSet <- subset(exploratorySampleImQ, select = c(fa2items))
ImQfa3ItemSet <- subset(exploratorySampleImQ, select = c(fa3items))
ImQfa4ItemSet <- subset(exploratorySampleImQ, select = c(fa4items))

omegafa1items3 <- suppressWarnings(suppressMessages(omega(ImQfa1ItemSet, 3, plot=FALSE)))
omegafa1items4 <- suppressWarnings(suppressMessages(omega(ImQfa1ItemSet, 4, plot=FALSE)))
omegafa1items5 <- suppressWarnings(suppressMessages(omega(ImQfa1ItemSet, 5, plot=FALSE)))
omegafa1items6 <- suppressWarnings(suppressMessages(omega(ImQfa1ItemSet, 6, plot=FALSE)))
omegafa1itemsECV <- c(omegafa1items3$ECV, omegafa1items4$ECV, omegafa1items5$ECV, omegafa1items6$ECV)
names(omegafa1itemsECV) <- c("3GF", "4GF", "5GF", "6GF")
omegafa1itemsO_h <- c(omegafa1items3$omega_h, omegafa1items4$omega_h, omegafa1items5$omega_h, omegafa1items6$omega_h)
names(omegafa1itemsO_h) <- c("3GF", "4GF", "5GF", "6GF")
omegaMeasuresfa1 <- round(as.data.frame(rbind(omegafa1itemsECV, omegafa1itemsO_h)), 2)
rownames(omegaMeasuresfa1) <- c("ECV", "omega_h")
omegaMeasuresfa1$means <- round(rowMeans(omegaMeasuresfa1), 2)
omegaMeasuresfa1
alphafa1items <- alpha(ImQfa1ItemSet, check.keys = TRUE)
unidimfa1 <- round(alphafa1items$Unidim[[1]], 2)
alphaMeasuresfa1 <- data.frame(unidimfa1, round(alphafa1items$total, 2))
alphaMeasuresfa1

omegafa2items3 <- suppressWarnings(suppressMessages(omega(ImQfa2ItemSet, 3, plot=FALSE)))
omegafa2items4 <- suppressWarnings(suppressMessages(omega(ImQfa2ItemSet, 4, plot=FALSE)))
omegafa2items5 <- suppressWarnings(suppressMessages(omega(ImQfa2ItemSet, 5, plot=FALSE)))
omegafa2items6 <- suppressWarnings(suppressMessages(omega(ImQfa2ItemSet, 6, plot=FALSE)))
omegafa2itemsECV <- c(omegafa2items3$ECV, omegafa2items4$ECV, omegafa2items5$ECV, omegafa2items6$ECV)
names(omegafa2itemsECV) <- c("3GF", "4GF", "5GF", "6GF")
omegafa2itemsO_h <- c(omegafa2items3$omega_h, omegafa2items4$omega_h, omegafa2items5$omega_h, omegafa2items6$omega_h)
names(omegafa2itemsO_h) <- c("3GF", "4GF", "5GF", "6GF")
omegaMeasuresfa2 <- round(as.data.frame(rbind(omegafa2itemsECV, omegafa2itemsO_h)), 2)
rownames(omegaMeasuresfa2) <- c("ECV", "omega_h")
omegaMeasuresfa2$means <- round(rowMeans(omegaMeasuresfa2), 2)
omegaMeasuresfa2
alphafa2items <- alpha(ImQfa2ItemSet, check.keys = TRUE)
unidimfa2 <- round(alphafa2items$Unidim[[1]], 2)
alphaMeasuresfa2 <- data.frame(unidimfa2, round(alphafa2items$total, 2))
alphaMeasuresfa2

omegafa3items3 <- suppressWarnings(suppressMessages(omega(ImQfa3ItemSet, 3, plot=FALSE)))
omegafa3items4 <- suppressWarnings(suppressMessages(omega(ImQfa3ItemSet, 4, plot=FALSE)))
omegafa3items5 <- suppressWarnings(suppressMessages(omega(ImQfa3ItemSet, 5, plot=FALSE)))
omegafa3items6 <- suppressWarnings(suppressMessages(omega(ImQfa3ItemSet, 6, plot=FALSE)))
omegafa3itemsECV <- c(omegafa3items3$ECV, omegafa3items4$ECV, omegafa3items5$ECV, omegafa3items6$ECV)
names(omegafa3itemsECV) <- c("3GF", "4GF", "5GF", "6GF")
omegafa3itemsO_h <- c(omegafa3items3$omega_h, omegafa3items4$omega_h, omegafa3items5$omega_h, omegafa3items6$omega_h)
names(omegafa3itemsO_h) <- c("3GF", "4GF", "5GF", "6GF")
omegaMeasuresfa3 <- round(as.data.frame(rbind(omegafa3itemsECV, omegafa3itemsO_h)), 2)
rownames(omegaMeasuresfa3) <- c("ECV", "omega_h")
omegaMeasuresfa3$means <- round(rowMeans(omegaMeasuresfa3), 2)
omegaMeasuresfa3
alphafa3items <- alpha(ImQfa3ItemSet, check.keys = TRUE)
unidimfa3 <- round(alphafa3items$Unidim[[1]], 2)
alphaMeasuresfa3 <- data.frame(unidimfa3, round(alphafa3items$total, 2))
alphaMeasuresfa3

omegafa4items3 <- suppressWarnings(suppressMessages(omega(ImQfa4ItemSet, 3, plot=FALSE)))
omegafa4items4 <- suppressWarnings(suppressMessages(omega(ImQfa4ItemSet, 4, plot=FALSE)))
omegafa4items5 <- suppressWarnings(suppressMessages(omega(ImQfa4ItemSet, 5, plot=FALSE)))
omegafa4items6 <- suppressWarnings(suppressMessages(omega(ImQfa4ItemSet, 6, plot=FALSE)))
omegafa4itemsECV <- c(omegafa4items3$ECV, omegafa4items4$ECV, omegafa4items5$ECV, omegafa4items6$ECV)
names(omegafa4itemsECV) <- c("3GF", "4GF", "5GF", "6GF")
omegafa4itemsO_h <- c(omegafa4items3$omega_h, omegafa4items4$omega_h, omegafa4items5$omega_h, omegafa4items6$omega_h)
names(omegafa4itemsO_h) <- c("3GF", "4GF", "5GF", "6GF")
omegaMeasuresfa4 <- round(as.data.frame(rbind(omegafa4itemsECV, omegafa4itemsO_h)), 2)
rownames(omegaMeasuresfa4) <- c("ECV", "omega_h")
omegaMeasuresfa4$means <- round(rowMeans(omegaMeasuresfa4), 2)
omegaMeasuresfa4
alphafa4items <- alpha(ImQfa4ItemSet, check.keys = TRUE)
unidimfa4 <- round(alphafa4items$Unidim[[1]], 2)
alphaMeasuresfa4 <- data.frame(unidimfa4, round(alphafa4items$total, 2))
alphaMeasuresfa4
@
\end{scriptsize}

Now let's generate new correlation plots for the items in each factor and across factors as we did for Study 1.
\begin{scriptsize}
<<Step11, eval=TRUE, echo=TRUE, results='markup', fig.show="hide", tidy=TRUE, message=FALSE, warning=FALSE>>= 
ImQitemSetCor <- round(cor(ImQitemSet, use="pairwise"), 2)
Fcor <- round(cor(ImQfa1ItemSet, use="pairwise"), 2)
Ecor <- round(cor(ImQfa2ItemSet, use="pairwise"), 2)
Dcor <- round(cor(ImQfa3ItemSet, use="pairwise"), 2)
Ccor <- round(cor(ImQfa4ItemSet, use="pairwise"), 2)
@
<<Step12, eval=TRUE, echo=FALSE, results='hide', fig.show="hide", tidy=TRUE, message=FALSE, warning=FALSE>>= 
pdf(file=paste(filepathdata, "exploratorySample/ImQitemSetCor.pdf", sep=""))
cor.plot(ImQitemSetCor, numbers = FALSE, cex.axis=.8, n=51, main="", zlim = c(-1,1), cex = .5, show.legend=TRUE, n.legend=10, keep.par=TRUE, las=2)
dev.off()

pdf(file=paste(filepathdata, "exploratorySample/Fitems.pdf", sep=""))
cor.plot(Fcor, numbers = TRUE, cex.axis=.8, n=51, main="", zlim = c(-1,1), cex = .5, show.legend=TRUE, n.legend=10, keep.par=TRUE, las=2)
dev.off()

pdf(file=paste(filepathdata, "exploratorySample/Citems.pdf", sep=""))
cor.plot(Ccor, numbers = TRUE, cex.axis=.8, n=51, main="", zlim = c(-1,1), cex = .5, show.legend=TRUE, n.legend=10, keep.par=TRUE, las=2)
dev.off()

pdf(file=paste(filepathdata, "exploratorySample/Eitems.pdf", sep=""))
cor.plot(Ecor, numbers = TRUE, cex.axis=.8, n=51, main="", zlim = c(-1,1), cex = .5, show.legend=TRUE, n.legend=10, keep.par=TRUE, las=2)
dev.off()

pdf(file=paste(filepathdata, "exploratorySample/Ditems.pdf", sep=""))
cor.plot(Dcor, numbers = TRUE, cex.axis=.8, n=51, main="", zlim = c(-1,1), cex = .5, show.legend=TRUE, n.legend=10, keep.par=TRUE, las=2)
dev.off()
@
\end{scriptsize}

\begin{figure}
\begin{floatrow}
\ffigbox{\includegraphics[width=.45\textwidth]{data/exploratorySample/Eitems.pdf}}{\caption{Study 2 - 8 Emotional Valence items}\label{fig:EmotionalValenceExp}}
\ffigbox{\includegraphics[width=.45\textwidth]{data/exploratorySample/Citems.pdf}}{\caption{Study 2 - 8 Complexity items}\label{fig:ComplexityExp}}
\end{floatrow}
\end{figure}

\begin{figure}
\begin{floatrow}
\ffigbox{\includegraphics[width=.45\textwidth]{data/exploratorySample/Fitems.pdf}}{\caption{Study 2 - 11 Frequency items}\label{fig:FrequencyExp}}
\ffigbox{\includegraphics[width=.45\textwidth]{data/exploratorySample/Ditems.pdf}}{\caption{Study 2 - 6 Directedness items}\label{fig:DirectednessExp}}
\end{floatrow}
\end{figure}

\begin{figure}
\begin{center}
\ffigbox{\includegraphics[width=.75\textwidth]{data/exploratorySample/ImQitemSetCor.pdf}}{\caption{Study 2 - 33 ImQ items}\label{fig:ImQExp}}
\end{center}
\end{figure}

\section{Study 3: Validation Sample}

For this we use the SAPA sample.  See the ``Description of the SAPA Sample'' file for descriptives.  The first analysis is to evaluate the fit statistics for the ImQ items as a set --- using all 33.  First, we have to pull out the ImQ items and then determine the number of pairwise administrations between these items are in the data.

\begin{scriptsize}
<<Step13a, eval=TRUE, echo=TRUE, results='markup', fig.show="hide", tidy=TRUE, message=FALSE, warning=FALSE>>= 
load(paste(filepathdata, "validationSample/validationSampleImQ.rdata", sep=""))

# Clean for ages
validSample <- validSample[validSample$age > 12, ]
dim(validSample)

# Reverse code the ImQ complexity items for ease of interpretation
validSample$q_5074 <- 7-validSample$q_5074
validSample$q_5068 <- 7-validSample$q_5068
validSample$q_5093 <- 7-validSample$q_5093
validSample$q_5080 <- 7-validSample$q_5080
validSample$q_5086 <- 7-validSample$q_5086
validSample$q_5064 <- 7-validSample$q_5064
validSample$q_5078 <- 7-validSample$q_5078
validSample$q_5081 <- 7-validSample$q_5081

ImQitems <- ItemLists["ImQitems"][[1]]
ImQfrequency <- ItemLists["ImQfrequency"][[1]]
ImQemoValence <- ItemLists["ImQemoValence"][[1]]
ImQdirectedness <- ItemLists["ImQdirectedness"][[1]]
ImQcomplexity <- ItemLists["ImQcomplexity"][[1]]

validSampleImQ <- subset(validSample, select = c(ImQitems))

AdminsMean <- round(mean(describe(validSampleImQ)$n), 0)
AdminsMean <- prettyNum(AdminsMean, big.mark=",", scientific=F)

AdminsMedian <- round(median(describe(validSampleImQ)$n), 0)
AdminsMedian <- prettyNum(AdminsMedian, big.mark=",", scientific=F)

AdminsSD <- round(sd(describe(validSampleImQ)$n), 0)
AdminsSD <- prettyNum(AdminsSD, big.mark=",", scientific=F)

# The mean, median and sd of administrations of items
AdminsMean
AdminsMedian
AdminsSD

pwiseAdmins1 <- pairwiseDescribe(validSampleImQ)
pwiseAdmins <- prettyNum(pwiseAdmins1, big.mark=",", scientific=F)

pwiseAdminsMean <- round(pwiseAdmins1$mean, 0)
pwiseAdminsMean <- prettyNum(pwiseAdminsMean, big.mark=",", scientific=F)

pwiseAdminsMedian <- round(pwiseAdmins1$median, 0)
pwiseAdminsMedian <- prettyNum(pwiseAdminsMedian, big.mark=",", scientific=F)

pwiseAdminsSD <- round(pwiseAdmins1$sd, 0)
pwiseAdminsSD <- prettyNum(pwiseAdminsSD, big.mark=",", scientific=F)

# The mean, median and sd of pairwise administrations of items
pwiseAdminsMean
pwiseAdminsMedian
pwiseAdminsSD

# Get the sd of the correlations
validSampleImQcorCI <- corCi(validSampleImQ, n.iter = 1000, plot="FALSE")

# Mean sd of correlations in the validation sample.
meanSDofRsV <- mean(validSampleImQcorCI$sds)
meanSDofRsV
# Median sd of correlations in the validation sample.
medianSDofRsV <- median(validSampleImQcorCI$sds)
medianSDofRsV
# min sd of correlations in the validation sample.
maxSDofRsV <- max(validSampleImQcorCI$sds)
maxSDofRsV
# max sd of correlations in the validation sample.
minSDofRsV <- min(validSampleImQcorCI$sds)
minSDofRsV

# Exploratory sample standard errors for item correlations
pdf(file=paste(filepathdata, "validationSample/valCIs.pdf", sep=""))
plot(density(validSampleImQcorCI$sds), main="", ylab="Frequency", xlab="Standard error of item level correlations", xlim= c(0,.25), ylim = c(0,210))
dev.off()

# Get some demographic information
# percent female
femalePerc <- 100 * round((table(validSample$sex)/sum(table(validSample$sex)))[2], 3)
femalePerc

# percent U.S.
USperc <- 100 * round((table(validSample$country)/sum(table(validSample$country)))[1], 3)
USperc
# total U.S.
UScounttext <- prettyNum((table(validSample$country))[1], big.mark=",", scientific=F)
UScounttext

# number of countries with P>0
countryTable <- sort(table(validSample$country),decreasing=TRUE)
countryCount <- length(countryTable)
countryCount
# number of countries with P>49
countryCount50 <- subset(countryTable, countryTable > 49)
length(countryCount50)

fullSampleAge <- psych::describe(validSample[,"age"])
fullSampleAge
MeanAge <- fullSampleAge[3]
MeanAge
SDAge <- fullSampleAge[4]
SDAge
MedianAge <- fullSampleAge[5]
MedianAge
MinAge <- fullSampleAge[8]
MinAge
MaxAge <- fullSampleAge[9]
MaxAge
@
\end{scriptsize}

So, the ImQ items were administered a mean of \Sexpr{AdminsMean} times (median: \Sexpr{AdminsMedian} and SD: \Sexpr{AdminsSD}) and have mean and median pairwise administration counts of \Sexpr{pwiseAdminsMean} and \Sexpr{pwiseAdminsMedian}, respectively (SD: \Sexpr{pwiseAdminsSD}).

\begin{figure}
\begin{center}
\ffigbox{\includegraphics[width=.5\textwidth]{data/validationSample/valCIs.pdf}}{\caption{Distribution of SEs of the correlations between items in Study 3 (validation sample)}\label{fig:valCIs}}
\end{center}
\end{figure}

\begin{scriptsize}
<<Step13b, eval=TRUE, echo=TRUE, results='markup', tidy=TRUE, message=FALSE, warning=FALSE>>= 
getOption("width")
options(width=150)
nfacImQ <- nfactors(validSampleImQ, 10)
nfacImQ
@
\end{scriptsize}

Preliminary results suggest a 4 factor structure by VSS complexity 1, VSS complexity 2, squared root mean residual (crosses below .05), \& MAP (it is equivalent at 4 and 6 factors).  The RMSEA and values are not great at any level.  The BIC estimates do not converge.  The plot of the average item level complexity looks good at 3 and/or 4 factors.

Now let's look at the unidimensionality of each of our 4 scales (based on the factors) and the factor loadings to see if they could be improved further.

\begin{scriptsize}
<<Step14, eval=TRUE, echo=TRUE, results='markup', fig.show="hide", tidy=TRUE, message=FALSE, warning=FALSE>>= 
alphaImQ1 <- alpha(subset(validSample, select = c(ImQfrequency)), check.keys=TRUE)
alphaImQ1
omegaImQ1 <- omega(subset(validSample, select = c(ImQfrequency)), plot = FALSE)
omegaImQ1$omega_h
# I think item q_5084/ImQ_F8R should be dropped.

alphaImQ2 <- alpha(subset(validSample, select = c(ImQemoValence)), check.keys=TRUE)
alphaImQ2
omegaImQ2 <- omega(subset(validSample, select = c(ImQemoValence)), plot = FALSE)
omegaImQ2$omega_h
# I think item q_5079/ImQ_EV6 should be dropped.

alphaImQ3 <- alpha(subset(validSample, select = c(ImQdirectedness)), check.keys=TRUE)
alphaImQ3
omegaImQ3 <- omega(subset(validSample, select = c(ImQdirectedness)), plot = FALSE)
omegaImQ3$omega_h
# Retain all items for now.

alphaImQ4 <- alpha(subset(validSample, select = c(ImQcomplexity)), check.keys=TRUE)
alphaImQ4
omegaImQ4 <- omega(subset(validSample, select = c(ImQcomplexity)), plot = FALSE)
omegaImQ4$omega_h
# I think item q_5064/ImQ_C1R should be dropped.

validSampleImQv2 <- subset(validSampleImQ, select = -c(q_5064, q_5079, q_5084))

ImQfa4v2 <- fa(validSampleImQv2, 4)
ImQfa4v2sorted <- fa.sort(ImQfa4v2)
ImQfa4v2sorted
# I think items q_5093, q_5061, and q_5078 should be dropped due to cross loadings.  
# Also q_5077 because it doesn't have a large primary loading.
@
\end{scriptsize}

It appears as though the structure of the scales would be improved if we dropped seven items.  So let's see how things look with those items removed.
\begin{scriptsize}
<<Step15, eval=TRUE, echo=TRUE, results='markup', tidy=TRUE, message=FALSE, warning=FALSE>>= 
validSampleImQadj <- subset(validSampleImQ, select = -c(q_5064, q_5079, q_5084, q_5093, q_5061, q_5077, q_5078))
nfacImQadj <- nfactors(validSampleImQadj, 10)
nfacImQadj
@
\end{scriptsize}

Four factors looks quite good.  VSS1 now shows 3 factors is slight better but 4 and 5 are also good.  VSS2, MAP, and average item complexity show 4.  BIC, RMSEA, and SRMR do not converge but the plots for those statistics all trail off at 4 factors (see figure).

Now, evaluate the psychometric properties of the full item set.
\begin{scriptsize}
<<Step16, eval=TRUE, echo=TRUE, results='markup', tidy=TRUE, message=FALSE, warning=FALSE>>= 
SAPAnames <- colnames(validSampleImQadj)
colnames(validSampleImQadj) <- c("D1", "F1", "EV1", "F2", "EV2", "C1", "F3", "EV3", "EV4", "D2", "F4", "C2", "EV5", "F5", "C3", "C4", "F6", "EV6", "D3", "C5", "F7", "F8", "D4", "EV7", "D5", "F9")
omegaImQw4 <- omega(validSampleImQadj, nfactors=4, sl=FALSE)
omegaImQw4
omegaImQw3 <- omega(validSampleImQadj, plot=FALSE)
omegaImQw3
omegaImQw5 <- omega(validSampleImQadj, nfactors=5, plot=FALSE)
omegaImQw5
@
\end{scriptsize}

So, the omega solutions point to a fairly clear 4 factor solution though it appears that the first factor is collapsing onto the first factor --- in other words, the first sub-factor is indistinguishable from the general factor.  Note that the omega hierarchical value for the 4 factor solution (\Sexpr{omegaImQw4$omega_h}) suggests that it is reasonably unidimensional, but also that there is some evidence of lower level structure.  A good question for validation research will be to evaluate the amount of incremental validity provided by the remaining three factors (after the first).

Let's see what happens when we conduct a 4 factor EFA on the reduced item set (only 26 items).
\begin{scriptsize}
<<Step17, eval=TRUE, echo=TRUE, results='markup', fig.show="hide", tidy=TRUE, message=FALSE, warning=FALSE>>= 
options(width=150)
ImQfa4 <- fa(validSampleImQadj, 4)
ImQfa4$Phi
ImQfa4sorted <- fa.sort(ImQfa4)

ItemInfo2 <- ItemInfo[SAPAnames,]
rownames(ItemInfo2) <- c("D1", "F1", "EV1", "F2", "EV2", "C1", "F3", "EV3", "EV4", "D2", "F4", "C2", "EV5", "F5", "C3", "C4", "F6", "EV6", "D3", "C5", "F7", "F8", "D4", "EV7", "D5", "F9")

ImQfa4sorted <- data.frame(ItemInfo2[rownames(ImQfa4sorted$loadings),2], round(ImQfa4sorted$loadings[,1:4], 2))
ImQfa4sorted
@
\end{scriptsize}
All of the items have primary factor loadings above (abs value of) 0.3 and none of the items have high secondary loadings.

We also want to know the internal consistency values for these 4 scales.
\begin{scriptsize}
<<Step18, eval=TRUE, echo=TRUE, results='markup', fig.show="hide", tidy=TRUE, message=FALSE, warning=FALSE>>= 
options(width=120)
colnames(validSampleImQadj) <- SAPAnames
ItemLists[c("ImQitems", "ImQfrequency", "ImQemoValence", "ImQcomplexity", "ImQdirectedness")] <- NULL

ImQitemLists <- list(
ImQitems = c("q_5063", "q_5082", "q_5073", "q_5087", "q_5069", "q_5076", "q_5066", "q_5088", "q_5092", "q_5090", "q_5067", "q_5070", "q_5075", "q_5071", "q_5065", "q_5083", "q_5086", "q_5074", "q_5081", "q_5068", "q_5080", "q_5062", "q_5085", "q_5089", "q_5072", "q_5091"),
ImQfrequency = c("q_5063", "q_5082", "q_5073", "q_5087", "q_5069", "q_5076", "q_5066", "q_5088", "q_5092"), 
ImQemoValence = c("q_5090", "q_5067", "q_5070", "q_5075", "q_5071", "q_5065", "q_5083"), 
ImQcomplexity = c("q_5086", "q_5074", "q_5081", "q_5068", "q_5080"), 
ImQdirectedness = c("q_5062", "q_5085", "q_5089", "q_5072", "q_5091"))

ItemLists <- c(ItemLists, ImQitemLists)

alphaImQall <- alpha(subset(validSampleImQadj, select = c(ItemLists["ImQitems"][[1]])), check.keys=TRUE)
alphaImQall
omegaImQall <- omega(subset(validSampleImQadj, select = c(ItemLists["ImQitems"][[1]])), plot = FALSE)
omegaImQall$omega_h

alphaImQfrequency <- alpha(subset(validSampleImQadj, select = c(ItemLists["ImQfrequency"][[1]])), check.keys=TRUE)
alphaImQfrequency
omegaImQfrequency <- omega(subset(validSampleImQadj, select = c(ItemLists["ImQfrequency"][[1]])), plot = FALSE)
omegaImQfrequency$omega_h

alphaImQemoValence <- alpha(subset(validSampleImQadj, select = c(ItemLists["ImQemoValence"][[1]])), check.keys=TRUE)
alphaImQemoValence
omegaImQemoValence <- omega(subset(validSampleImQadj, select = c(ItemLists["ImQemoValence"][[1]])), plot = FALSE)
omegaImQemoValence$omega_h

alphaImQdirectedness <- alpha(subset(validSampleImQadj, select = c(ItemLists["ImQcomplexity"][[1]])), check.keys=TRUE)
alphaImQdirectedness
omegaImQdirectedness <- omega(subset(validSampleImQadj, select = c(ItemLists["ImQcomplexity"][[1]])), plot = FALSE)
omegaImQdirectedness$omega_h

alphaImQcomplexity <- alpha(subset(validSampleImQadj, select = c(ItemLists["ImQdirectedness"][[1]])), check.keys=TRUE)
alphaImQcomplexity
omegaImQcomplexity <- omega(subset(validSampleImQadj, select = c(ItemLists["ImQdirectedness"][[1]])), plot = FALSE)
omegaImQcomplexity$omega_h

omegaVec <- c(omegaImQall$omega_h, omegaImQfrequency$omega_h, omegaImQemoValence$omega_h, omegaImQcomplexity$omega_h, omegaImQdirectedness$omega_h)

alphaVec <- c(alphaImQall$total$raw_alpha, alphaImQfrequency$total$raw_alpha, alphaImQemoValence$total$raw_alpha, alphaImQcomplexity$total$raw_alpha, alphaImQdirectedness$total$raw_alpha)

ICtable <- data.frame(alphaVec, omegaVec)
colnames(ICtable) <- c("alpha", "omega")
rownames(ICtable) <- c("All 26 items", "Frequency", "Emotional Valence", "Complexity", "Directedness")
ICtable
@
\end{scriptsize}

Now we will consider how the ImQ scales correlate with the Big Five (BFI and SPI-5), the lower-level personality factors (SPI-27), and the ICAR scores.  

\begin{scriptsize}
<<Step19, eval=TRUE, echo=TRUE, results='markup', fig.show="hide", tidy=TRUE, message=FALSE, warning=FALSE>>=
# Make keys for ImQ scales
KeyItemLists <- ImQitemLists

dataSet <- validSample
keys.list <- list()
n <- 0
for (i in 1:length(KeyItemLists)) {
  n <- n + 1
  list <- KeyItemLists[[i]]
  alpha1 <- suppressWarnings(alpha(subset(dataSet, select = c(KeyItemLists[i][[1]])), check.keys=TRUE))
  keys <- alpha1$keys
  keyed <- list()
  for (j in 1:length(keys)) {
    keyed <- append(keyed, ifelse(keys[j] == 1, list[j], 
                                  paste("-", list[j], sep="")))
    }
  assign(paste("keys_", names(KeyItemLists)[i], sep=""), unlist(keyed))
  keys.list[[n]] <- unlist(keyed)
  names(keys.list)[[n]] <- names(KeyItemLists[i])
  
rm(list, alpha1, j, keys, keyed)
}

keys.list.ImQ <- keys.list

# It appears we need to flip ImQcomplexity

load("/Users/dmc174/Drive/other/analyses/SAPA/items/Keys.rdata")

keys.list.more <- keys.list[c("BFI1_Extraversion8", "BFI1_Agreeableness9", "BFI1_Conscientiousness9", "BFI1_Neuroticism8", "BFI1_Openness10", "SPI_135_27_5_Agree", "SPI_135_27_5_Consc", "SPI_135_27_5_Extra", "SPI_135_27_5_Neuro", "SPI_135_27_5_Open", "SPI_135_27_5_Compassion", "SPI_135_27_5_Irritability", "SPI_135_27_5_Sociability", "SPI_135_27_5_WellBeing", "SPI_135_27_5_SensationSeeking", "SPI_135_27_5_Anxiety", "SPI_135_27_5_Honesty", "SPI_135_27_5_Industry", "SPI_135_27_5_Intellect", "SPI_135_27_5_Creativity", "SPI_135_27_5_Impulsivity", "SPI_135_27_5_AttentionSeeking", "SPI_135_27_5_Order", "SPI_135_27_5_Authoritarianism", "SPI_135_27_5_Charisma", "SPI_135_27_5_Trust", "SPI_135_27_5_Humor", "SPI_135_27_5_EmotionalExpressiveness", "SPI_135_27_5_ArtAppreciation", "SPI_135_27_5_Introspection", "SPI_135_27_5_Perfectionism", "SPI_135_27_5_SelfControl", "SPI_135_27_5_Conformity", "SPI_135_27_5_Adaptability", "SPI_135_27_5_EasyGoingness", "SPI_135_27_5_EmotionalStability", "SPI_135_27_5_Conservatism", "ICAR60", "LNiq", "MRiq", "R3Diq", "VRiq")]

keys.list <- c(keys.list.ImQ, keys.list.more)
@
\end{scriptsize}

Get the IRT scores for the SPI 27 and ICAR.
\begin{scriptsize}
<<Step20, eval=TRUE, echo=TRUE, results='markup', fig.show="hide", tidy=TRUE, message=FALSE, warning=FALSE>>=
##########
# Get the SPI27 IRT scores
##########
load("/Users/dmc174/Drive/other/SAPAcode/SPIscoring/tools/IRTinfoSPI27.rdata")

# IRT score
dataSet <- subset(validSample, select = c(orderForItems))
SPIirtScores <- matrix(nrow=dim(dataSet)[1], ncol=27)
SPIirtSEs <- matrix(nrow=dim(dataSet)[1], ncol=27)
for (i in 1:length(IRToutputSPI27)) {
  data <- subset(dataSet, select = c(rownames(IRToutputSPI27[[i]]$irt$difficulty[[1]])))
  calibrations <- IRToutputSPI27[[i]]
  scored <- scoreIrt(calibrations, data, keys = NULL, cut = 0)
  irt.data <- irt.se(calibrations, score = as.matrix(scored[,1]))
  TScoring <- (irt.data[,"scores"]-thetaNormsMeans[i])/thetaNormsSDs[i]
  TScores <- TScoring*10+50
  SPIirtScores[,i] <- TScores
  TScoreSEs <- irt.data[,"se"]*10
  SPIirtSEs[,i] <- TScoreSEs
  rm(TScores, TScoring, TScoreSEs, scored, calibrations, data)
}
SPIirtScores <- as.data.frame(SPIirtScores)
colnames(SPIirtScores) <- scaleNames
SPIirtSEs <- as.data.frame(SPIirtSEs)
colnames(SPIirtSEs) <- scaleNames
rm(IRToutputSPI27, thetaNormsMeans, thetaNormsSDs, scaleNames)
SPIirtScores$SPI27_Irritability <- 100-SPIirtScores$SPI27_Irritability
SPIirtScores$SPI27_Sociability <- 100-SPIirtScores$SPI27_Sociability
SPIirtScores$SPI27_Honesty <- 100-SPIirtScores$SPI27_Honesty
SPIirtScores$SPI27_Industry <- 100-SPIirtScores$SPI27_Industry
SPIirtScores$SPI27_Order <- 100-SPIirtScores$SPI27_Order
SPIirtScores$SPI27_ArtAppreciation <- 100-SPIirtScores$SPI27_ArtAppreciation
SPIirtScores$SPI27_Adaptability <- 100-SPIirtScores$SPI27_Adaptability

describe(SPIirtScores, fast=TRUE)
describe(SPIirtSEs, fast=TRUE)
# Note the extent to which this sample differs from the norming sample.

##########
# Get the ICAR IRT scores
##########
load("/Users/dmc174/Drive/other/SAPAcode/SPIscoring/tools/IRTinfoICAR.rdata")

# IRT score
dataSet <- subset(validSample, select = c(orderForItems))
ICARirtScores <- matrix(nrow=dim(dataSet)[1], ncol=5)
ICARirtSEs <- matrix(nrow=dim(dataSet)[1], ncol=5)
for (i in 1:length(IRToutputICAR)) {
  data <- subset(dataSet, select = c(names(IRToutputICAR[[i]]$irt$difficulty[[1]])))
  calibrations <- IRToutputICAR[[i]]
  scored <- scoreIrt(calibrations, data, keys = NULL, cut = 0)
  irt.data <- irt.se(calibrations, score = as.matrix(scored[,1]))
  TScoring <- (irt.data[,"scores"]-thetaNormsMeans[i])/thetaNormsSDs[i]
  TScores <- TScoring*10+50
  ICARirtScores[,i] <- TScores
  TScoreSEs <- irt.data[,"se"]*10
  ICARirtSEs[,i] <- TScoreSEs
  rm(TScores, TScoring, TScoreSEs, scored, calibrations, data)
}
ICARirtScores <- as.data.frame(ICARirtScores)
colnames(ICARirtScores) <- scaleNames
ICARirtSEs <- as.data.frame(ICARirtSEs)
colnames(ICARirtSEs) <- scaleNames
rm(IRToutputICAR, thetaNormsMeans, thetaNormsSDs, scaleNames)

describe(ICARirtScores, fast=TRUE)
describe(ICARirtSEs, fast=TRUE)
# Note the extent to which this sample differs from the norming sample.
@
\end{scriptsize}

Get scale scores for all the relevant scales.
\begin{scriptsize}
<<Step21, eval=TRUE, echo=TRUE, results='markup', fig.show="hide", tidy=TRUE, message=FALSE, warning=FALSE>>=
PolyItemsToBeScored <- subset(validSample, select = c(unique(unlist(ItemLists[names(keys.list)[1:42]]))))
PolyKeys <- make.keys(PolyItemsToBeScored, keys.list[1:42])

# First we want to get the person-level scores
PolyScaleScores <- scoreItems(keys = PolyKeys, items = PolyItemsToBeScored, impute="none")
PolyScaleScores_person <- PolyScaleScores$scores
rm(PolyScaleScores)

# Now get the scores again using scoreOverlap
PolyScaleScores <- scoreOverlap(PolyKeys, PolyItemsToBeScored, impute="none")
PolyAlphas <- round(t(PolyScaleScores$alpha), 2)

# Here compare sum scores and IRT for the SPI in case any one questions.
compareIRTandSumSPI <- data.frame(SPIirtScores, subset(PolyScaleScores_person, select = c(SPI_135_27_5_Compassion:SPI_135_27_5_Conservatism)))
pdf(file=paste(filepathdata, "validationSample/compareIRTandSum.pdf", sep=""))
cor.plot(compareIRTandSumSPI, numbers = FALSE, cex.axis=.8, n=51, main="IRT v sum score comparison", zlim = c(-1,1), cex = .5, show.legend=TRUE, n.legend=10, keep.par=TRUE, las=2)
dev.off()

validSampleScaleScores <- data.frame(subset(validSample, select = c(RID:RaceNotReported)), subset(PolyScaleScores_person, select = c(ImQitems:SPI_135_27_5_Open)), SPIirtScores, ICARirtScores)
@
\end{scriptsize}

Now reduce the data frame down to the variables we want to include in correlational/regression analyses.  Then show the first-order correlations.
\begin{scriptsize}
<<Step22, eval=TRUE, echo=TRUE, results='markup', fig.show="hide", tidy=TRUE, message=FALSE, warning=FALSE>>=
validSampleScaleScoresCor <- validSampleScaleScores

# Make sex binary for now
validSampleScaleScoresCor$sex <- ifelse(validSampleScaleScoresCor$sex == 3, NA, validSampleScaleScoresCor$sex) 

# Drop variables for which correlations are nonsensical
validSampleScaleScoresCor <- subset(validSampleScaleScoresCor, select = -c(RID, marstatus, country, state, p1occ:occIncomeEst, military, insurance, BMI, diabetes, veggie, prescriptions, ERwhy, cope, domain1, domain2, domain3, ethnic, orientation, SES, p1SES, p2SES, pedu, poccPrestige, poccIncome, pSES))

# Be sure that the demo variables are in the most logical order
validSampleScaleScoresCor <- subset(validSampleScaleScoresCor, select = c(age:ZBMI, Zedu, ZoccPrestige, ZoccIncomeEst, ZSES, Zp1edu, Zp1occPrestige, Zp1occIncomeEst, Zp1SES, Zp2edu, Zp2occPrestige, Zp2occIncomeEst, Zp2SES, Zpedu, ZpoccPrestige, ZpoccIncome, ZpSES, White:ImQdirectedness, BFI1_Agreeableness9, BFI1_Conscientiousness9, BFI1_Extraversion8, BFI1_Neuroticism8:VRiq))

validSampleScaleScoresCorNotCI <- cor(validSampleScaleScoresCor, use="pairwise")

validSampleScaleScoresCorCI <- cor.ci(validSampleScaleScoresCor, p=.05, n.iter = 100, overlap=FALSE, plot = FALSE, poly = FALSE)
@
\end{scriptsize}

\begin{scriptsize}
<<Step23, eval=TRUE, echo=FALSE, results='hide', fig.show="hide", tidy=TRUE, message=FALSE, warning=FALSE>>=
pdf(file=paste(filepathdata, "validationSample/largeCorPlot.pdf", sep=""))
cor.plot(validSampleScaleScoresCorNotCI, numbers = FALSE, cex.axis=.35, n=51, main="", zlim = c(-1,1), cex = .5, show.legend=FALSE, n.legend=10, keep.par=TRUE, las=2)
dev.off()

# Now make some more useful smaller plots
pdf(file=paste(filepathdata, "validationSample/ImQCorPlot.pdf", sep=""))
cor.plot(validSampleScaleScoresCorNotCI[c(41:45),c(41:45)], numbers = TRUE, cex.axis=1, n=51, main="", zlim = c(-1,1), cex = 1, show.legend=FALSE, n.legend=10, keep.par=TRUE, las=2)
dev.off()

pdf(file=paste(filepathdata, "validationSample/ImQdemoCorPlot.pdf", sep=""))
cor.plot(validSampleScaleScoresCorNotCI[c(1:20,41:45),c(1:20,41:45)], numbers = TRUE, cex.axis=1, n=51, main="", zlim = c(-1,1), cex = .5, show.legend=FALSE, n.legend=10, keep.par=TRUE, las=2)
dev.off()

pdf(file=paste(filepathdata, "validationSample/ImQSESCorPlot.pdf", sep=""))
cor.plot(validSampleScaleScoresCorNotCI[c(21:36,41:45),c(21:36,41:45)], numbers = TRUE, cex.axis=1, n=51, main="", zlim = c(-1,1), cex = .5, show.legend=FALSE, n.legend=10, keep.par=TRUE, las=2)
dev.off()

pdf(file=paste(filepathdata, "validationSample/ImQICARCorPlot.pdf", sep=""))
cor.plot(validSampleScaleScoresCorNotCI[c(83:87, 41:45), c(83:87, 41:45)], numbers = TRUE, cex.axis=1, n=51, main="", zlim = c(-1,1), cex = .5, show.legend=FALSE, n.legend=10, keep.par=TRUE, las=2)
dev.off()

pdf(file=paste(filepathdata, "validationSample/ImQBigFiveCorPlot.pdf", sep=""))
cor.plot(validSampleScaleScoresCorNotCI[c(41:55), c(41:55)], numbers = TRUE, cex.axis=1, n=51, main="", zlim = c(-1,1), cex = .5, show.legend=FALSE, n.legend=10, keep.par=TRUE, las=2)
dev.off()

pdf(file=paste(filepathdata, "validationSample/BigFiveCorPlot.pdf", sep=""))
cor.plot(validSampleScaleScoresCorNotCI[c(46:55), c(46:55)], numbers = TRUE, cex.axis=.8, n=51, main="", zlim = c(-1,1), cex = .5, show.legend=TRUE, n.legend=10, keep.par=TRUE, las=2)
dev.off()

pdf(file=paste(filepathdata, "validationSample/ImQSPICorPlot.pdf", sep=""))
cor.plot(validSampleScaleScoresCorNotCI[c(41:46,56:82), c(41:46,56:82)], numbers = TRUE, cex.axis=.7, n=51, main="", zlim = c(-1,1), cex = .3, show.legend=TRUE, n.legend=10, keep.par=TRUE, las=2)
dev.off()

pdf(file=paste(filepathdata, "validationSample/SPICorPlot.pdf", sep=""))
cor.plot(validSampleScaleScoresCorNotCI[c(51:82), c(51:82)], numbers = TRUE, cex.axis=.7, n=51, main="", zlim = c(-1,1), cex = .3, show.legend=TRUE, n.legend=10, keep.par=TRUE, las=2)
dev.off()

pdf(paste(filepathdata, "validationSample/corCIs.pdf", sep = ""), onefile = TRUE)
plot(density(validSampleScaleScoresCorCI$sds), main="", ylab="Frequency", xlab="Standard error of scale level correlations", xlim= c(0,.2), ylim = c(0,200))
dev.off()
@
\end{scriptsize}

Now we will print many of the correlation plots.  Note that the last one shows the distribution of the standard errors of the correlations.  An alternative method for showing the significance of the correlations would be to state the confidence intervals for each single point estimate.  The code below shows the correlations and CIs for the (large) subset that relates to the ImQ scales.
\begin{scriptsize}
<<Step24, eval=TRUE, echo=TRUE, results='markup', fig.show="hide", tidy=TRUE, message=FALSE, warning=FALSE>>=
corsWithCIs <- round(data.frame(validSampleScaleScoresCorCI$ci[2661:2880, 2], validSampleScaleScoresCorCI$means[2661:2880], validSampleScaleScoresCorCI$ci[2661:2880, 4:5]), 2)
colnames(corsWithCIs) <- c("lowerCI", "ptEst", "upperCI", "pvalue")
corsWithCIs
write.csv(corsWithCIs, file=paste(filepathdata, "validationSample/corsWithCIs.csv", sep = ""))
@
\end{scriptsize}

\begin{figure}
\begin{center}
\ffigbox{\includegraphics[width=.95\textwidth]{data/validationSample/ImQCorPlot.pdf}}{\caption{Correlations among the ImQ scales}\label{fig:ImQCorPlot}}
\end{center}
\end{figure}

\begin{figure}
\begin{center}
\ffigbox{\includegraphics[width=.95\textwidth]{data/validationSample/ImQdemoCorPlot.pdf}}{\caption{Correlations among (some) demographics and health variables with the ImQ scales}\label{fig:ImQdemoCorPlot}}
\end{center}
\end{figure}

\begin{figure}
\begin{center}
\ffigbox{\includegraphics[width=.95\textwidth]{data/validationSample/ImQSESCorPlot.pdf}}{\caption{Correlations among SES status variables and the ImQ scales}\label{fig:ImQdemoCorPlot}}
\end{center}
\end{figure}

\begin{figure}
\begin{center}
\ffigbox{\includegraphics[width=.95\textwidth]{data/validationSample/ImQICARCorPlot.pdf}}{\caption{Correlations among ICAR variables and the ImQ scales}\label{fig:ImQdemoCorPlot}}
\end{center}
\end{figure}

\begin{figure}
\begin{center}
\ffigbox{\includegraphics[width=.95\textwidth]{data/validationSample/ImQBigFiveCorPlot.pdf}}{\caption{Correlations among Big Five variables (BFI \& SPI) and the ImQ scales}\label{fig:ImQdemoCorPlot}}
\end{center}
\end{figure}

\begin{figure}
\begin{center}
\ffigbox{\includegraphics[width=.95\textwidth]{data/validationSample/ImQSPICorPlot.pdf}}{\caption{Correlations among the 27 SPI factors and the ImQ scales}\label{fig:ImQdemoCorPlot}}
\end{center}
\end{figure}

\begin{figure}
\begin{center}
\ffigbox{\includegraphics[width=.75\textwidth]{data/validationSample/corCIs.pdf}}{\caption{Distribution of SEs of the correlations}\label{fig:corCIs}}
\end{center}
\end{figure}

\section{Note about reproducibility of this file}

The file was created in \LaTeX and last compiled on \today{} using \Sexpr{R.version$version.string} \citep{R}, RStudio version 0.99.893, and \Sexpr{packageDescription("psych")[[1]]} version \Sexpr{packageDescription("psych")[[2]]} \citep{Rpsych} (other \texttt{R} packages may have been used sparingly -- see the code below).  Users can re-create these analyses by using the version of this file ending with the extension `.rnw' (email the first author if you do not already have access to this version).  Reproduction of the pdf version requires compiling the rnw version in \texttt{R} or similar \texttt{R} based software such as RStudio.  Exact reproduction will require a few modifications of the rnw file to match the user settings.  Specifically, these include (1) changing a few lines of code for loading and saving data in order to map to the user's directory and (2) dealing with the citations made through the document denoted by the citation tag \textbackslash cite (these could be linked to a locally hosted citation file or simply deleted).  Note that the rnw file was written with knitr tags for handling code chunks and for typesetting LaTeX into PDF using pdfLaTeX.  An easy alternative for reproducing the code is to pull the code chunks out of the `rnw' file and run them manually in any \texttt{R} interface.

\bibliographystyle{apalike}
\bibliography{/Users/dmc174/Drive/resources/dmcCL.bib}

\end{document}